Visual Detection of Structural Changes in Time-Varying Graphs Using Persistent Homology
Topological data analysis is an emerging area in exploratory data analysis
and data mining. Its main tool, persistent homology, has become a popular
technique to study the structure of complex, high-dimensional data. In this
paper, we propose a novel method using persistent homology to quantify
structural changes in time-varying graphs. Specifically, we transform each
instance of the time-varying graph into a metric space, extract topological
features using persistent homology, and compare those features over time. We
provide a visualization that assists in time-varying graph exploration and
helps to identify patterns of behavior within the data. To validate our
approach, we conduct several case studies on real-world data sets and show how
our method can find cyclic patterns, deviations from those patterns, and
one-time events in time-varying graphs. We also examine whether a
persistence-based similarity measure, used as a graph metric, satisfies a set
of well-established, desirable properties for graph metrics.
Persistent Homology Guided Force-Directed Graph Layouts
Graphs are commonly used to encode relationships among entities, yet their
abstractness makes them difficult to analyze. Node-link diagrams are popular
for drawing graphs, and force-directed layouts provide a flexible method for
node arrangements that use local relationships in an attempt to reveal the
global shape of the graph. However, clutter and overlap of unrelated structures
can lead to confusing graph visualizations. This paper leverages the persistent
homology features of an undirected graph as derived information for interactive
manipulation of force-directed layouts. We first discuss how to efficiently
extract 0-dimensional persistent homology features from both weighted and
unweighted undirected graphs. We then introduce the interactive persistence
barcode used to manipulate the force-directed graph layout. In particular, the
user adds and removes contracting and repulsing forces generated by the
persistent homology features, eventually selecting the set of persistent
homology features that most improve the layout. Finally, we demonstrate the
utility of our approach across a variety of synthetic and real datasets.
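The extraction of 0-dimensional persistent homology features from a weighted undirected graph, mentioned in the abstract above, can be illustrated with a union-find pass over the edges. This is a minimal sketch under stated assumptions, not the paper's implementation: every vertex is assumed to be born at filtration value 0, edges are assumed to enter in ascending weight order, and the function name `zero_dim_persistence` and the toy edge list are hypothetical.

```python
import math


def zero_dim_persistence(num_nodes, edges):
    """Return (birth, death) bars for connected components.

    edges: iterable of (u, v, weight) with integer node ids in [0, num_nodes).
    Assumption: all vertices are born at 0 and an edge enters the
    filtration at its weight, processed in ascending order.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Path halving keeps the union-find trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    bars = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # This edge merges two components, so one of them dies at weight w.
            parent[root_u] = root_v
            bars.append((0.0, float(w)))

    # Components that never merge persist forever.
    surviving_roots = {find(x) for x in range(num_nodes)}
    bars.extend((0.0, math.inf) for _ in surviving_roots)
    return bars


# Hypothetical toy graph: a weighted triangle plus one pendant vertex.
toy_edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 3.0), (2, 3, 5.0)]
bars = zero_dim_persistence(4, toy_edges)
print(bars)
```

Under these assumptions the finite death times coincide with the edge weights of a minimum spanning tree, which is why a single sorted pass suffices. The edge of weight 3.0 closes a cycle and therefore produces no 0-dimensional bar.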
Runaway Feedback Loops in Predictive Policing
Predictive policing systems are increasingly used to determine how to
allocate police across a city in order to best prevent crime. Discovered crime
data (e.g., arrest counts) are used to help update the model, and the process
is repeated. Such systems have been empirically shown to be susceptible to
runaway feedback loops, where police are repeatedly sent back to the same
neighborhoods regardless of the true crime rate.
In response, we develop a mathematical model of predictive policing that
proves why this feedback loop occurs, show empirically that this model exhibits
such problems, and demonstrate how to change the inputs to a predictive
policing system (in a black-box manner) so the runaway feedback loop does not
occur, allowing the true crime rate to be learned. Our results are
quantitative: we can establish a link (in our model) between the degree to
which runaway feedback causes problems and the disparity in crime rates between
areas. Moreover, we can also demonstrate the way in which reported
incidents of crime (those reported by residents) and discovered
incidents of crime (i.e., those directly observed by police officers dispatched
as a result of the predictive policing algorithm) interact: in brief, while
reported incidents can attenuate the degree of runaway feedback, they cannot
entirely remove it without the interventions we suggest.
Comment: Extended version accepted to the 1st Conference on Fairness,
Accountability and Transparency, 2018. Adds further treatment of reported as
well as discovered incident
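The feedback loop described in this abstract can be illustrated with a small two-area simulation. This is a toy sketch under stated assumptions, not the paper's model: patrols are assumed to be sent to an area with probability proportional to its recorded incident count, each patrol is assumed to discover an incident with that area's true rate, and the optional `filter_inputs` step is one illustrative way of altering the inputs, assumed for this sketch rather than taken from the abstract. All names and parameter values are hypothetical.

```python
import random


def simulate_patrols(rate_a, rate_b, days, filter_inputs=False, seed=0):
    """Return the final share of patrols allocated to area A.

    Assumptions of this sketch: one patrol per day, allocation proportional
    to recorded counts, and only discovered incidents update the counts.
    """
    rng = random.Random(seed)
    counts = [1.0, 1.0]  # recorded incident counts for areas A and B
    rates = [rate_a, rate_b]
    for _ in range(days):
        share_a = counts[0] / (counts[0] + counts[1])
        area = 0 if rng.random() < share_a else 1
        discovered = rng.random() < rates[area]
        if not discovered:
            continue
        if filter_inputs:
            # Illustrative input filter (an assumption): record the incident
            # only with probability equal to the share of patrols currently
            # sent to the other area, which offsets the sampling bias.
            other_share = 1.0 - share_a if area == 0 else share_a
            if rng.random() >= other_share:
                continue
        counts[area] += 1.0
    return counts[0] / (counts[0] + counts[1])


raw_share = simulate_patrols(0.3, 0.2, days=20000)
filtered_share = simulate_patrols(0.3, 0.2, days=20000, filter_inputs=True)
print(raw_share, filtered_share)
```

In this toy setting the expected update with the filter is proportional to the true rates, so the filtered share should tend toward rate_a / (rate_a + rate_b), while the unfiltered share tends to drift toward the higher-rate area. Individual runs vary with the seed and the number of days.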
Querying and creating visualizations by analogy
Journal Article
While there have been advances in visualization systems, particularly in multi-view visualizations and visual exploration, the process of building visualizations remains a major bottleneck in data exploration. We show that provenance metadata collected during the creation of pipelines can be reused to suggest similar content in related visualizations and guide semi-automated changes. We introduce the idea of query-by-example in the context of an ensemble of visualizations, and the use of analogies as first-class operations in a system to guide scalable interactions. We describe an implementation of these techniques in VisTrails, a publicly available, open-source system.
A Vector Field Design Approach to Animated Transitions
Animated transitions can be effective in explaining and exploring a small number of visualizations where there are drastic changes in the scene over a short interval of time. This is especially true if data elements cannot be visually distinguished by other means. Current research in animated transitions has mainly focused on linear transitions (all elements follow straight-line paths) or enhancing coordinated motion through bundling of linear trajectories. In this paper, we introduce animated transition design, a technique to build smooth, non-linear transitions for clustered data with either minimal or no user involvement. The technique is flexible and simple to implement, and has the additional advantage that it explicitly enhances coordinated motion and can avoid crowding, which are both important factors to support object tracking in a scene. We investigate its usability, provide preliminary evidence for the effectiveness of this technique through metric evaluations and a user study, and discuss limitations and future directions.
Problems with Shapley-value-based explanations as feature importance measures
Game-theoretic formulations of feature importance have become popular as a
way to "explain" machine learning models. These methods define a cooperative
game between the features of a model and distribute influence among these input
elements using some form of the game's unique Shapley values. Justification for
these methods rests on two pillars: their desirable mathematical properties,
and their applicability to specific motivations for explanations. We show that
mathematical problems arise when Shapley values are used for feature importance
and that the solutions to mitigate these necessarily induce further complexity,
such as the need for causal reasoning. We also draw on additional literature to
argue that Shapley values do not provide explanations which suit human-centric
goals of explainability.
Comment: Accepted to ICML 202
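The idea of distributing influence among features with Shapley values can be made concrete by averaging each player's marginal contribution over all orderings. This is a minimal sketch assuming a hypothetical three-feature game; the value function `toy_value` and the feature names are invented for illustration and do not come from the paper or from any explanation library.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    value: function mapping a frozenset of players to a real number.
    The cost grows factorially, so this is only practical for a few players.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def toy_value(coalition):
    # Hypothetical game: x1 and x2 contribute on their own and also
    # interact; x3 never changes the value.
    v = 0.0
    if "x1" in coalition:
        v += 1.0
    if "x2" in coalition:
        v += 2.0
    if "x1" in coalition and "x2" in coalition:
        v += 1.0
    return v


phi = shapley_values(["x1", "x2", "x3"], toy_value)
print(phi)  # {'x1': 1.5, 'x2': 2.5, 'x3': 0.0}
```

The interaction term of 1.0 is split evenly between x1 and x2, and the three values sum to the value of the full coalition (4.0). How the value function is defined for a real model is exactly the kind of choice the abstract flags as a source of difficulty, and this sketch takes no position on it.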